independence model
- North America > United States > Illinois > Cook County > Chicago (0.05)
- Asia > Singapore (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.05)
- Asia > Singapore (0.04)
Exact and Approximate MCMC for Doubly-intractable Probabilistic Graphical Models Leveraging the Underlying Independence Model
Chen, Yujie, Chakraborty, Antik, Bhadra, Anindya
Bayesian inference for doubly-intractable probabilistic graphical models typically involves variations of the exchange algorithm or approximate Markov chain Monte Carlo (MCMC) samplers. However, existing methods for both classes of algorithms require either perfect samplers or sequential samplers for complex models, which are often either not available, or suffer from poor mixing, especially in high dimensions. We develop a method that does not require perfect or sequential sampling, and can be applied to both classes of methods: exact and approximate MCMC. The key to our approach is to utilize the tractable independence model underlying an intractable probabilistic graphical model for the purpose of constructing a finite sample unbiased Monte Carlo (and not MCMC) estimate of the Metropolis--Hastings ratio. This innovation turns out to be crucial for scalability in high dimensions. The method is demonstrated on the Ising model. Gradient-based alternatives to construct a proposal, such as Langevin and Hamiltonian Monte Carlo approaches, also arise as a natural corollary to our general procedure, and are demonstrated as well.
- Asia > Middle East > Republic of Türkiye > Batman Province > Batman (0.04)
- North America > United States > Indiana (0.04)
- Asia > Middle East > Jordan (0.04)
- (2 more...)
- Media > Film (0.69)
- Leisure & Entertainment (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)
Constraint-based causal discovery with tiered background knowledge and latent variables in single or overlapping datasets
Bang, Christine W., Didelez, Vanessa
In this paper we consider the use of tiered background knowledge within constraint based causal discovery. Our focus is on settings relaxing causal sufficiency, i.e. allowing for latent variables which may arise because relevant information could not be measured at all, or not jointly, as in the case of multiple overlapping datasets. We first present novel insights into the properties of the 'tiered FCI' (tFCI) algorithm. Building on this, we introduce a new extension of the IOD (integrating overlapping datasets) algorithm incorporating tiered background knowledge, the 'tiered IOD' (tIOD) algorithm. We show that under full usage of the tiered background knowledge tFCI and tIOD are sound, while simple versions of the tIOD and tFCI are sound and complete. We further show that the tIOD algorithm can often be expected to be considerably more efficient and informative than the IOD algorithm even beyond the obvious restriction of the Markov equivalence classes. We provide a formal result on the conditions for this gain in efficiency and informativeness. Our results are accompanied by a series of examples illustrating the exact role and usefulness of tiered background knowledge.
- Europe > Germany > Bremen > Bremen (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
An Asymmetric Independence Model for Causal Discovery on Path Spaces
Manten, Georg, Casolo, Cecilia, Mogensen, Søren Wengel, Kilbertus, Niki
We develop the theory linking 'E-separation' in directed mixed graphs (DMGs) with conditional independence relations among coordinate processes in stochastic differential equations (SDEs), where causal relationships are determined by "which variables enter the governing equation of which other variables". We prove a global Markov property for cyclic SDEs, which naturally extends to partially observed cyclic SDEs, because our asymmetric independence model is closed under marginalization. We then characterize the class of graphs that encode the same set of independence relations, yielding a result analogous to the seminal 'same skeleton and v-structures' result for directed acyclic graphs (DAGs). In the fully observed case, we show that each such equivalence class of graphs has a greatest element as a parsimonious representation and develop algorithms to identify this greatest element from data. We conjecture that a greatest element also exists under partial observations, which we verify computationally for graphs with up to four nodes.
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Denmark > Capital Region > Copenhagen (0.04)
- (2 more...)
On Different Notions of Redundancy in Conditional-Independence-Based Discovery of Graphical Models
Faller, Philipp M., Janzing, Dominik
The goal of conditional-independence-based discovery of graphical models is to find a graph that represents the independence structure of variables in a given dataset. To learn such a representation, conditional-independence-based approaches conduct a set of statistical tests that suffices to identify the graphical representation under some assumptions on the underlying distribution of the data. In this work, we highlight that due to the conciseness of the graphical representation, there are often many tests that are not used in the construction of the graph. These redundant tests have the potential to detect or sometimes correct errors in the learned model. We show that not all tests contain this additional information and that such redundant tests have to be applied with care. Precisely, we argue that particularly those conditional (in)dependence statements are interesting that follow only from graphical assumptions but do not hold for every probability distribution.
- Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
- Europe > Czechia > Prague (0.04)
Uncovering Meanings of Embeddings via Partial Orthogonality
Jiang, Yibo, Aragam, Bryon, Veitch, Victor
Machine learning tools often rely on embedding text as vectors of real numbers. In this paper, we study how the semantic structure of language is encoded in the algebraic structure of such embeddings. Specifically, we look at a notion of ``semantic independence'' capturing the idea that, e.g., ``eggplant'' and ``tomato'' are independent given ``vegetable''. Although such examples are intuitive, it is difficult to formalize such a notion of semantic independence. The key observation here is that any sensible formalization should obey a set of so-called independence axioms, and thus any algebraic encoding of this structure should also obey these axioms. This leads us naturally to use partial orthogonality as the relevant algebraic structure. We develop theory and methods that allow us to demonstrate that partial orthogonality does indeed capture semantic independence. Complementary to this, we also introduce the concept of independence preserving embeddings where embeddings preserve the conditional independence structures of a distribution, and we prove the existence of such embeddings and approximations to them.
- North America > United States > Illinois > Cook County > Chicago (0.05)
- Asia > Singapore (0.04)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.89)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
The m-connecting imset and factorization for ADMG models
Andrews, Bryan, Cooper, Gregory F., Richardson, Thomas S., Spirtes, Peter
Directed acyclic graph (DAG) models have become widely studied and applied in statistics and machine learning -- indeed, their simplicity facilitates efficient procedures for learning and inference. Unfortunately, these models are not closed under marginalization, making them poorly equipped to handle systems with latent confounding. Acyclic directed mixed graph (ADMG) models characterize margins of DAG models, making them far better suited to handle such systems. However, ADMG models have not seen wide-spread use due to their complexity and a shortage of statistical tools for their analysis. In this paper, we introduce the m-connecting imset which provides an alternative representation for the independence models induced by ADMGs. Furthermore, we define the m-connecting factorization criterion for ADMG models, characterized by a single equation, and prove its equivalence to the global Markov property. The m-connecting imset and factorization criterion provide two new statistical tools for learning and inference with ADMG models. We demonstrate the usefulness of these tools by formulating and evaluating a consistent scoring criterion with a closed form solution.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Pennsylvania (0.04)
- Europe > Czechia > Prague (0.04)
Conditions and Assumptions for Constraint-based Causal Structure Learning
The paper formalizes constraint-based structure learning of the "true" causal graph from observed data when unobserved variables are also existent. We define a "generic" structure learning algorithm, which provides conditions that, under the faithfulness assumption, the output of all known exact algorithms in the literature must satisfy, and which outputs graphs that are Markov equivalent to the causal graph. More importantly, we provide clear assumptions, weaker than faithfulness, under which the same generic algorithm outputs Markov equivalent graphs to the causal graph. We provide the theory for the general class of models under the assumption that the distribution is Markovian to the true causal graph, and we specialize the definitions and results for structural causal models.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
- (5 more...)
Maximum Entropy Based Significance of Itemsets
We consider the problem of defining the significance of an itemset. We say that the itemset is significant if we are surprised by its frequency when compared to the frequencies of its sub-itemsets. In other words, we estimate the frequency of the itemset from the frequencies of its sub-itemsets and compute the deviation between the real value and the estimate. For the estimation we use Maximum Entropy and for measuring the deviation we use Kullback-Leibler divergence. A major advantage compared to the previous methods is that we are able to use richer models whereas the previous approaches only measure the deviation from the independence model. We show that our measure of significance goes to zero for derivable itemsets and that we can use the rank as a statistical test. Our empirical results demonstrate that for our real datasets the independence assumption is too strong but applying more flexible models leads to good results.
- North America > United States > District of Columbia > Washington (0.14)
- Europe > Finland > Uusimaa > Helsinki (0.05)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (2 more...)